Machine Reading of Biological Texts - Bacteria-Biotope Extraction
Abstract
The tremendous amount of scientific literature about bacteria and their biotopes underlines the need for efficient mechanisms to extract this information automatically. This paper presents a system that extracts bacteria and their habitats, as well as the relations between them. We investigate to what extent current techniques are suited to this task and test a variety of models. To detect entities in biological text, we use a linear-chain Conditional Random Field (CRF). To predict relations between the entities, we build a model based on logistic regression. Building a system on these techniques, we explore several improvements to both the generation and the selection of good candidates. One contribution is the extended flexibility of our ontology mapper, which allows for more advanced boundary detection. Furthermore, we find value in combining several distinct candidate generation rules. Using these techniques, we report results that significantly improve upon the state of the art for the BioNLP Bacteria Biotopes task.
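To make the entity detection step more concrete, the following is a minimal sketch of Viterbi decoding, the inference step a linear-chain CRF uses to pick the best tag sequence. It is not the system described in the abstract: the tag names (`B-BAC`, `I-BAC`, `B-HAB`, `I-HAB`, `O`), the example sentence, and all scores are hypothetical and hand-set for illustration, whereas a trained CRF would learn emission and transition weights from annotated data.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set: BAC = bacterium mention, HAB = habitat mention.
TAGS = ["O", "B-BAC", "I-BAC", "B-HAB", "I-HAB"]


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[i][t] is the score of tag t at position i (missing = 0.0).
    transitions[(p, t)] is the score of moving from tag p to tag t (missing = 0.0).
    """
    if not emissions:
        return []
    # best[i][t] = score of the best path that ends with tag t at position i
    best = [{t: emissions[0].get(t, 0.0) for t in tags}]
    back: List[Dict[str, str]] = []
    for i in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[i].get(cur, 0.0)
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hand-set illustrative scores (assumptions, not learned weights).
tokens = ["Bacillus", "subtilis", "lives", "in", "soil"]
emissions = [
    {"B-BAC": 2.0},
    {"I-BAC": 1.5, "B-BAC": 1.0},
    {"O": 2.0},
    {"O": 2.0},
    {"B-HAB": 1.5, "O": 0.5},
]
# Penalize transitions that are invalid under the BIO scheme.
invalid = [
    ("O", "I-BAC"), ("O", "I-HAB"),
    ("B-HAB", "I-BAC"), ("I-HAB", "I-BAC"),
    ("B-BAC", "I-HAB"), ("I-BAC", "I-HAB"),
]
transitions = {pair: -10.0 for pair in invalid}

path = viterbi(emissions, transitions, TAGS)
for token, tag in zip(tokens, path):
    print(f"{token}\t{tag}")
```

In a pipeline like the one the abstract outlines, the tagged spans would then be paired into bacterium-habitat candidates and scored by the relation model; that step is not shown here.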
Similar resources
BioNLP 2011 Task Bacteria Biotope - The Alvis system
This paper describes the system of the INRA Bibliome research group applied to the Bacteria Biotope (BB) task of the BioNLP 2011 shared tasks. Bacteria, geographical locations and host entities were processed by a pattern-based approach and domain lexical resources. For the extraction of environment locations, we propose a framework based on semantic analysis supported by an ontology of the bio...
Overview of the Bacteria Biotope Task at BioNLP Shared Task 2016
This paper presents the Bacteria Biotope task of the BioNLP Shared Task 2016, which follows the previous 2013 and 2011 editions. The task focuses on the extraction of the locations (biotopes and geographical places) of bacteria from PubMed abstracts and the characterization of bacteria and their associated habitats with respect to reference knowledge sources (NCBI taxonomy, OntoBiotope ontology...
IRISA participation to BioNLP-ST13: lazy-learning and information retrieval for information extraction tasks
This paper describes the information extraction techniques developed in the framework of the participation of IRISATexMex to the following BioNLP-ST13 tasks: Bacterial Biotope subtasks 1 and 2, and Graph Regulation Network. The approaches developed are general-purpose ones and do not rely on specialized preprocessing, nor specialized external data, and they are expected to work independently of...
BioNLP Shared Task 2011 - Bacteria Biotope
This paper presents the Bacteria Biotope task as part of the BioNLP Shared Tasks 2011. The Bacteria Biotope task aims at extracting the location of bacteria from scientific Web pages. Bacteria location is crucial knowledge in biology for phenotype studies. The paper details the corpus specification and the evaluation metrics, and summarizes and discusses the participant results.
BioNLP shared Task 2013 - An Overview of the Bacteria Biotope Task
This paper presents the Bacteria Biotope task of the BioNLP Shared Task 2013, which follows BioNLP-ST-11. The Bacteria Biotope task aims to extract the location of bacteria from scientific web pages and to characterize these locations with respect to the OntoBiotope ontology. Bacteria locations are crucial knowledge in biology for phenotype studies. The paper details the corpus specifications, ...
Publication date: 2015